-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[Offloading] Extend OffloadBinary format to support multiple metadata entries #169425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
You can test this locally with the following command:git-clang-format --diff origin/main HEAD --extensions c,h,cpp -- clang/test/Driver/linker-wrapper-image.c llvm/include/llvm/Object/OffloadBinary.h llvm/include/llvm/ObjectYAML/OffloadYAML.h llvm/lib/Frontend/Offloading/OffloadWrapper.cpp llvm/lib/Object/Binary.cpp llvm/lib/Object/OffloadBinary.cpp llvm/lib/ObjectYAML/OffloadEmitter.cpp llvm/lib/ObjectYAML/OffloadYAML.cpp llvm/tools/obj2yaml/offload2yaml.cpp --diff_from_common_commit
View the diff from clang-format here.diff --git a/llvm/include/llvm/Object/OffloadBinary.h b/llvm/include/llvm/Object/OffloadBinary.h
index e7db4a71c..5a9b4120d 100644
--- a/llvm/include/llvm/Object/OffloadBinary.h
+++ b/llvm/include/llvm/Object/OffloadBinary.h
@@ -88,7 +88,7 @@ public:
struct Header {
uint8_t Magic[4] = {0x10, 0xFF, 0x10, 0xAD}; // 0x10FF10AD magic bytes.
uint32_t Version = OffloadBinary::Version; // Version identifier.
- uint64_t Size; // Size in bytes of this entire binary.
+ uint64_t Size; // Size in bytes of this entire binary.
uint64_t EntriesOffset; // Offset in bytes to the start of entries block.
uint64_t EntriesCount; // Number of metadata entries in the binary.
};
@@ -162,7 +162,8 @@ private:
reinterpret_cast<const StringEntry *>(&Buffer[TheEntry->StringOffset]);
for (uint64_t I = 0, E = TheEntry->NumStrings; I != E; ++I) {
StringRef Key = &Buffer[StringMapBegin[I].KeyOffset];
- StringData[Key] = StringRef(&Buffer[StringMapBegin[I].ValueOffset], StringMapBegin[I].ValueSize);
+ StringData[Key] = StringRef(&Buffer[StringMapBegin[I].ValueOffset],
+ StringMapBegin[I].ValueSize);
}
}
|
🐧 Linux x64 Test Results
Failed Tests(click on a test name to see its output) ClangClang.Driver/linker-wrapper-hip-amdgcnspirv.cClang.Driver/linker-wrapper-hip-no-rdc.cClang.Driver/linker-wrapper-image.cClang.Driver/linker-wrapper.cClang.Driver/offload-packager.cIf these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the |
jhuber6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this in principle, but
jhuber6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with these changes, now we just need the few changes to make sure that everything still works with the new version. Check all the uses of these fields and we probably want an obj2yaml support for multiple entries so we can test it. Realistically you just need to update the extractOffloadFiles to additionally extract files within the same header.
| uint64_t EntrySize; // Size of the metadata entry in bytes. | ||
| uint64_t Size; // Size in bytes of this entire binary. | ||
| uint64_t EntriesOffset; // Offset in bytes to the start of entries block. | ||
| uint64_t EntriesCount; // Number of metadata entries in the binary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with changing this, it wasn't really used for anything but a sanity check during parsing
Thank you for quick feedback, Joseph, I'll continue updating this PR with the further changes. |
|
FYI: I'm now updating |
Yeah, we'll probably just want to make the interface take an array reference, possibly a helper that takes a single one and turns it into an array of one for convenience |
jhuber6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should keep the accessor methods to all take a single image, so the extract code just returns an array of these. You'd need to make sure the reference stays alive but it might be simpler that way, I don't think the interface handles this well, like for checking the image kind.
| OffloadKind getOffloadKind() const { return Entries[0].first->TheOffloadKind; } | ||
| uint32_t getVersion() const { return TheHeader->Version; } | ||
| uint32_t getFlags() const { return TheEntry->Flags; } | ||
| uint32_t getFlags() const { return Entries[0].first->Flags; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this has any users, we can and should change the interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this getter for now.
| StringRef getArch() const { return getString("arch"); } | ||
| StringRef getImage() const { | ||
| return StringRef(&Buffer[TheEntry->ImageOffset], TheEntry->ImageSize); | ||
| return StringRef(&Buffer[Entries[0].first->ImageOffset], Entries[0].first->ImageSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want an index, can initialize it to zero for backward compat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an index but did not initialize yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we won't need an index here, that's what we can hopefully avoid
| MapVector<StringRef, StringRef> StringData; | ||
| /// Location of the metadata entries within the binary mapped to | ||
| /// the key-value string data. | ||
| SmallVector<std::pair<const Entry *, StringMap>, 1> Entries; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't decide if this should be two vectors that take the same index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If OffloadBinary objects share the same header and each corresponds to a single Entry/Image, then we do not need vectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd want to be compatible with the appended form as well, since that occurs if people use relocatable links or other merging methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, basically, each OffloadBinary would be something like this in terms of data:
TheHeader <-- shared somehow between all OffloadBinary objects
Buffer <-- shared somehow between all OffloadBinary objects
Vector<Entry> <-- identical for all OffloadBinary objects
Vector<StringMap> <-- identical for all OffloadBinary objects
Index <-- initialized to consecutive numbers when extracting OffloadBinaries, so that getters would work as before for compatiblity.
Is this what you have in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I think we return references, but there's an explicit copy method? In the copy method would need to copy the header, but references can just use the existing one or something. The extracted interface isn't the binary serialized one so we can probably do a little magic if needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, there is an explicit deep copy method. ok, I'll try this also.
| const StringEntry *StringMapBegin = reinterpret_cast<const StringEntry *>( | ||
| &Buffer[TheEntry->StringOffset]); | ||
| StringMap Strings; | ||
| for (uint64_t SI = 0, SE = TheEntry->NumStrings; SI != SE; ++SI) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just take a reference to the StringData and keep the old handling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. It is old handling. We still read each StringEntry one-time. 2 loops allow to split StringEntries in accordance with Entries they belong to.
If we read just all StringEntries as before to a single StringMap for all entries, we need to keep start and end indexes to be able to know which key-value pairs correspond to which entry, is it what you suggest?
Could you please elaborate, I'm not sure I understand what you mean exactly... Maybe you can just provide a sketch of the interface that you have in mind? |
The extract interface already returns a vector of these binaries. We could keep the same extract interface so we don't need to change the get routines to index. The main difference is that these would be slices that share a header so you'd need to manage the memory or create new buffers by copying the header. |
I see, interesting idea, I'll try. |
|
I updated OffloadBinary.h in accordance with how I understand your idea so far. Not a complete of course, just a few changes to see if we are aligning our understanding. |
| const OffloadBinary::Header *Header = *HeaderOrErr; | ||
|
|
||
| switch (Header->Version) { | ||
| case 1: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version just changes the count, right? Might be easier to just initialize the count to one and set it to the proper size if the version is version 2. That way we should be able to share the code
|
|
||
| // This parsing should never fail because it has already been parsed. | ||
| auto NewBinaryOrErr = OffloadBinary::create(*Buffer); | ||
| auto NewBinaryOrErr = OffloadBinary::createV1(*Buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need a versioned create method, everything should use the new version after this. Realistically we probably just want an overload that takes a single one and another that takes an array. The old function just calls the new one with an array of size one.
No description provided.